Predicting the failure of two-dimensional silica glasses

Being able to predict the failure of materials from structural information is a fundamental issue with enormous practical and industrial relevance for the monitoring of devices and components. Thanks to recent advances in deep learning, accurate failure predictions are becoming possible even for strongly disordered solids, but the sheer number of parameters used in the process renders a physical interpretation of the results impossible. Here we address this issue and use machine learning methods to predict the failure of simulated two-dimensional silica glasses from their initial undeformed structure. We then exploit Gradient-weighted Class Activation Mapping (Grad-CAM) to build attention maps associated with the predictions, and we demonstrate that these maps are amenable to physical interpretation in terms of topological defects and local potential energies. We show that our predictions can be transferred to samples with shapes or sizes different from those used in training, as well as to experimental images. Our strategy illustrates how artificial neural networks trained with numerical simulation results can provide interpretable predictions of the behavior of experimentally measured structures.
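For reference, attention maps of this kind can be computed along the following lines (a minimal Grad-CAM sketch for a TensorFlow/Keras model; the layer name and output index are placeholders, not details from the paper):

```python
import tensorflow as tf

def grad_cam(model, image, layer_name, output_index=0):
    """Grad-CAM map: where the chosen conv layer supports one model output."""
    grad_model = tf.keras.Model(
        model.inputs, [model.get_layer(layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image[None, ...])
        target = preds[:, output_index]
    grads = tape.gradient(target, conv_maps)       # d(target)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))   # global-average-pooled gradients
    cam = tf.nn.relu(tf.reduce_sum(weights[:, None, None, :] * conv_maps, -1))[0]
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # normalized to [0, 1]
```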

We have three different sets of parameters in the SVM analysis: the parameters of the symmetry functions (δ, µ), the affine strain, and the hyper-parameters of the support vector machine. δ is fixed at 0.1 Å. µ runs from 0.2 Å to an upper bound µ_up in steps of 0.2 Å. Atoms farther than 2.5 Å beyond this upper bound from the central atom are neglected. µ_up as well as the affine strain are varied over the sets of possible values listed in Table 1. The SVM is optimized with respect to its regularization parameter C and two kernels (linear and radial basis function). The investigated choices for C are listed in Table 1, together with the investigated range of the kernel width γ for the radial basis function kernel. To investigate whether feeding in symmetry functions of the initial configuration has any benefit, we also train models with only the features from the affinely transformed state and compare them with models trained on features of both the initial and the affinely deformed state.
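For concreteness, here is a minimal sketch of Gaussian radial symmetry functions with this (δ, µ) parameterization; the Gaussian functional form and the helper names are our assumptions, since the text only specifies the parameters:

```python
import numpy as np

def radial_symmetry_functions(neighbors, center, mus, delta=0.1, pad=2.5):
    """One Gaussian density G(µ) per µ value for a single central atom.

    neighbors : (N, 2) neighbor coordinates in Å
    center    : (2,) coordinates of the central atom
    mus       : Gaussian centers, e.g. 0.2, 0.4, ..., µ_up (Å)
    Atoms farther than µ_up + pad from the central atom are neglected.
    """
    r = np.linalg.norm(neighbors - center, axis=1)
    r = r[(r > 0.0) & (r <= mus.max() + pad)]
    # sum a Gaussian of width delta centered at each µ over all neighbors
    return np.exp(-(r[None, :] - mus[:, None]) ** 2 / (2.0 * delta**2)).sum(axis=1)

mus = np.arange(0.2, 5.0 + 1e-9, 0.2)   # µ_up = 5.0 Å is an example value
```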
For every combination of δ, µ, , the SVM hyper-parameter are optimized via five fold cross validation on the training set, retraining for the optimal SVM parameters and judge the final mode by its performance on the test set. We perform and 80/20 split to generate the training and test set (total samples 913/910/737 for disorder levels 0.2/0.3/variable). As SVM do not scale well computationally with the size of the training set, we have to down-sample the training set to perform the training. We perform two different subset selections. For the first subset we choose all atoms which are part of the first bond breaking and an equal number of atoms from the rest of the population thus creating a balanced training set. For the second subset we again choose all atoms which are part of the first bond breaking and add enough atoms from the remaining (unbroken) population to reach final size of 10000 training samples thus generating an unbalanced training set. The weights are adjusted to rebalance the training set.
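As an illustration of this selection and optimization loop, a minimal scikit-learn sketch follows (the feature matrix, the C and γ grids, and all sizes except the 10,000-sample target are placeholders; the actual grids are in Table 1, and the search is repeated for every combination of δ, µ, and affine strain):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(20_000, 60))             # placeholder symmetry-function features
y = (rng.random(20_000) < 0.05).astype(int)   # 1 = atom in first bond breaking (illustrative)

# 80/20 train/test split, as described above.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Unbalanced subset: all "broken" atoms plus unbroken atoms up to 10,000 samples;
# class weights compensate for the remaining imbalance.
broken = np.flatnonzero(y_tr == 1)
unbroken = rng.choice(np.flatnonzero(y_tr == 0),
                      size=10_000 - broken.size, replace=False)
subset = np.concatenate([broken, unbroken])

param_grid = [                                # example grids, not the Table 1 values
    {"kernel": ["linear"], "C": [0.1, 1.0, 10.0]},
    {"kernel": ["rbf"], "C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]},
]
search = GridSearchCV(SVC(class_weight="balanced"), param_grid, cv=5)  # five-fold CV
search.fit(X_tr[subset], y_tr[subset])        # refits the best model on the subset
print(search.best_params_, search.score(X_te, y_te))
```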
The biggest difference in model performance is seen between the two constructions of the training set (Supplementary Figure 2 a, e and i), where balancing shifts the emphasis from the overall fraction of correct predictions to the fraction of captured plastic events. For samples of variance 0.2 it can also be seen that the optimal kernel for models trained on the unbalanced training set is always the radial basis function kernel. The impact of the other model parameters is less obvious. Models trained on symmetry functions of both the initial undeformed and the affinely deformed state come closer to the desired case of all-correct predictions, but the differences from models trained only on the affinely deformed state are small.

[Diagram: ResNet50 — InputLayer → ZeroPadding2D → Conv2D → MaxPooling2D → … → GlobalAveragePooling2D, with feature-map sizes 128x128, 64x64, 32x32, 16x16, 8x8, 4x4.]

Supplementary Figure 3: Scheme of the ResNet50 architecture. A scheme of the ResNet50 architecture from the Keras library, obtained with the keras-visualization package; some layers are omitted from the schematic. The input dimension is first reduced to 64 pixels and then processed and progressively reduced through the convolutional blocks (parts of the ResNet50 architecture with the same dimension). ResNet50 can take as input any image with dimensions larger than 32x32. The output of ResNet50 is a 2048-dimensional vector (obtained by global pooling of a 4x4x2048 tensor), which is passed to a fully connected layer to predict the disorder, the strain, and the first bond-break location. In our work we have considered the convolutional blocks with 32x32, 16x16, 8x8 and 4x4 dimensions.

a) The model prediction (green cross), its confidence interval (green ellipse), and the real rupture location (red dot) are shown on top of two non-strained images used in the machine learning prediction. b) Cumulative probability of the total error e_r, for different disorder levels (see Methods for details). c, d) Scatter plots of the confidence intervals σ_px, σ_py, computed from the different predictions obtained by data augmentation, versus the absolute prediction errors |e_x|, |e_y|. The panels show that when the confidence interval is small, the prediction error tends to be small as well. e, f) Ratio of prediction error to confidence interval, showing that for almost all samples the error is equal to or less than the confidence interval (one standard deviation of the predictions over data augmentation).
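A minimal Keras sketch of this setup (the backbone call matches the Keras ResNet50 API; the four-output head and the mean-squared-error loss are our assumptions about the head layout):

```python
import tensorflow as tf

# ResNet50 backbone; global average pooling turns the 4x4x2048 tensor
# into the 2048-dimensional feature vector described above.
backbone = tf.keras.applications.ResNet50(
    include_top=False, weights=None, input_shape=(128, 128, 3), pooling="avg")
# Fully connected head regressing disorder, strain, and the (x, y)
# coordinates of the first bond break (layout assumed for illustration).
outputs = tf.keras.layers.Dense(4)(backbone.output)
model = tf.keras.Model(backbone.input, outputs)
model.compile(optimizer="adam", loss="mse")
```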
Supplementary Figure 7: Correlation between crack path and rupture location. Cross-correlation between the x coordinate of the first broken bond and the average x coordinate of the crack path. The color code represents the disorder s².

[Diagram: ResNet50 + UpSampling — InputLayer → ResNet50 → Conv2D → UpSampling2D, from 128x128 input back to 128x128 output.]

Supplementary Figure 8: Scheme of the ResNet50 architecture combined with upsampling layers. A scheme of the architecture, inspired by colorization models, is presented. The input image is first passed through a ResNet50 (discussed in Supplementary Figure 3), producing a tensor of 4x4x2048 dimensions. This tensor is then passed through upsampling and convolutional layers until the input dimension is restored. This architecture allows the network to predict images; in our work we have used it to predict the image of the fractured silica configuration.

Supplementary Figure 9: Crack path prediction. The first 36 samples in the test set of the full crack-path prediction task; see Supplementary Figure 5 for details (notice that samples are assigned to train/test sets randomly, not sequentially). For each panel, the fracture atoms are shown as magenta dots, and the model prediction as a black-to-green background, with greener areas corresponding to more likely crack positions according to the model.
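A sketch of the Supplementary Figure 8 encoder-decoder, assuming five UpSampling2D/Conv2D stages to go from 4x4 back to 128x128 (the filter counts and the single-channel sigmoid output are illustrative choices, not taken from the paper):

```python
import tensorflow as tf

backbone = tf.keras.applications.ResNet50(
    include_top=False, weights=None, input_shape=(128, 128, 3))  # -> 4x4x2048
x = backbone.output
for filters in (512, 256, 128, 64, 32):       # 4 -> 8 -> 16 -> 32 -> 64 -> 128
    x = tf.keras.layers.UpSampling2D(2)(x)
    x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
# One-channel map at input resolution, e.g. the predicted fracture image.
out = tf.keras.layers.Conv2D(1, 1, activation="sigmoid")(x)
model = tf.keras.Model(backbone.input, out)
```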

Supplementary Figure 10: Transfer learning: predicting location from a strain-trained model. First 20 samples of the strain-to-location task, where the strain-trained model is used to predict the rupture strain of different regions of a larger sample. By sliding a square window over different parts of the sample (window center x, in pixels, on the horizontal axis), different rupture strains s(x) are predicted (vertical axis). The real rupture location is marked as a vertical dashed line. The figure shows that, in most cases, the rupture location tends to fall in a region of lower predicted rupture strain.
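The sliding-window procedure can be sketched as follows (a rough illustration: the window size, stride, grayscale input channel, and the assumption that the model maps one window to a scalar rupture strain are ours):

```python
import numpy as np

def strain_profile(model, sample, window=128, stride=16):
    """Predicted rupture strain s(x) as a function of window center x.

    sample : (window, W) grayscale image of the large sample
    model  : maps a (1, window, window, 1) batch to a scalar rupture strain
    """
    centers, strains = [], []
    for x0 in range(0, sample.shape[1] - window + 1, stride):
        patch = sample[:, x0:x0 + window]
        strains.append(model.predict(patch[None, :, :, None], verbose=0).item())
        centers.append(x0 + window // 2)
    # the rupture is expected near the minimum of the predicted strain profile
    return np.array(centers), np.array(strains)
```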